Concept-Based Data Classification in Relational Databases
Authors
Abstract
Data classification is a process that groups objects with common properties into classes and produces a classification scheme over a set of data objects. It is useful for understanding and organizing database data and for building hierarchical schemes in databases. We investigate data classification in relational databases and develop a method for data classification by concept-based generalization. Our method applies an attribute-oriented generalization technique that uses knowledge about data concepts, integrates the data classification process with relational operations, and provides an efficient way to classify data in relational databases. The characteristics of each class can be extracted automatically during classification. Moreover, quantitative information can be registered during generalization to assist classification based on database statistics. Our analysis of the classification algorithms shows that the attribute-oriented approach substantially reduces the complexity of data classification in large databases.
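As a rough illustration of the attribute-oriented generalization idea described in the abstract, the following minimal Python sketch replaces attribute values with higher-level concepts, merges tuples that become identical, and records a count for each merged tuple. It is not the authors' algorithm: the concept hierarchies, attribute names, sample rows, and the `generalize` helper are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical concept hierarchies (assumption, not from the paper):
# each low-level value maps to a more general concept.
HIERARCHIES = {
    "major": {"physics": "science", "biology": "science",
              "history": "arts", "music": "arts"},
    "city": {"Vancouver": "Canada", "Toronto": "Canada", "Seattle": "USA"},
}


def generalize(rows, attributes, hierarchies):
    """Climb one level in each attribute's hierarchy, merge tuples that
    become identical, and keep a count per generalized tuple."""
    counts = Counter()
    for row in rows:
        generalized = tuple(
            # Values without a known parent concept are left unchanged.
            hierarchies.get(attr, {}).get(value, value)
            for attr, value in zip(attributes, row)
        )
        counts[generalized] += 1
    return dict(counts)


# Hypothetical sample relation with two attributes.
ATTRIBUTES = ("major", "city")
ROWS = [
    ("physics", "Vancouver"),
    ("biology", "Toronto"),
    ("history", "Seattle"),
    ("music", "Vancouver"),
    ("physics", "Seattle"),
]

# Each key is a candidate class; each value is how many tuples it covers.
CLASSES = generalize(ROWS, ATTRIBUTES, HIERARCHIES)
```

In this sketch the counts stand in for the quantitative information mentioned in the abstract: a class that covers many original tuples carries more statistical weight than one that covers few.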
Similar resources
An ILP-Based Concept Discovery System for Multi-Relational Data Mining
Kavurucu, Yusuf. Ph.D., Department of Computer Engineering. Supervisor: Asst. Prof. Dr. Pınar Şenkul. July 2009, 118 pages. Multi Relational Data Mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. However, as patter...
Managing Hierarchies and Taxonomies in Relational Databases
The need to maintain classification and retrieval mechanisms that rely on concept hierarchies is as old as language itself. Familiar examples include the Dewey decimal classification system used in libraries and the system for classifying life forms developed in the 1700s by Carolus Linnaeus. A more recent example is Yahoo’s subject taxonomy. Information technology has led to an explosive growt...
Confidence-based Concept Discovery in Multi-Relational Data Mining
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. Several relational knowledge discovery systems have been developed employing various search strategies, heuristics, language pattern limitations and hypothesis evaluation criteria, in order to cope with intract...
Metadata Enrichment for Automatic Data Entry Based on Relational Data Models
Automatically generating data entry forms from relational data models is a well-known idea that has drawn growing attention as agile software development methods have become more popular and programming tools have matured. One of the requirements of the automation methods, whether in commercial products or the relevant research projects, ...
Visualization Schemas for Flexible Information Visualization
Relational databases provide significant flexibility to organize, store, and manipulate an infinite variety of complex data collections. This flexibility is enabled by the concept of relational data schemas, which allow data owners to easily design custom databases according to their unique needs. However, user interfaces and information visualizations for accessing and utilizing databases have...